9 research outputs found

    Feature Selection by Singular Value Decomposition for Reinforcement Learning

    Get PDF
    Solving reinforcement learning problems using value function approximation requires good state features, but constructing them manually is often difficult or impossible. We propose Fast Feature Selection (FFS), a new method for automatically constructing good features in problems with high-dimensional state spaces but low-rank dynamics. Such problems are common, for example, when controlling simple dynamical systems from direct visual observations, with states represented by raw images. FFS relies on domain samples and singular value decomposition to construct features that can be used to approximate the optimal value function well. Compared with earlier methods, such as LFD, FFS is simpler and enjoys better theoretical performance guarantees. Our experimental results show that our approach is also more stable, computes better solutions, and can be faster than prior work.
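
    The abstract only outlines the construction, so here is a minimal sketch of the general idea under one plausible reading: estimate a transition matrix and reward vector from domain samples, append the reward as an extra column, and take the top left singular vectors as features. The exact construction in the paper may differ, and all names below are illustrative.

    import numpy as np

    def ffs_features(P_hat, r_hat, k):
        """Return k state features: top left singular vectors of [r_hat | P_hat].

        P_hat : (S, S) transition matrix estimated from domain samples
        r_hat : (S,)   reward vector estimated from domain samples
        """
        M = np.column_stack([r_hat, P_hat])        # reward appended as a column
        U, _, _ = np.linalg.svd(M, full_matrices=False)
        return U[:, :k]                            # (S, k) feature matrix

    # Example on a small random low-rank model
    S, k = 100, 5
    W = np.random.rand(S, k) @ np.random.rand(k, S)
    P_hat = W / W.sum(axis=1, keepdims=True)       # row-stochastic, rank about k
    Phi = ffs_features(P_hat, np.random.rand(S), k)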

    Efficient Data-Driven Robust Policies for Reinforcement Learning

    Get PDF
    Applying the reinforcement learning methodology to domains that involve risky decisions, such as medicine or robotics, requires high confidence in the performance of a policy before its deployment. Markov Decision Processes (MDPs) have served as a well-established model in reinforcement learning (RL). An MDP model assumes that the exact transition probabilities and rewards are available. However, in most cases, these parameters are unknown and are typically estimated from data, which is inherently prone to errors. Consequently, due to such statistical errors, the computed policy's actual performance often differs from the designer's expectation. In this context, practitioners can either be negligent and ignore parameter uncertainty during decision-making or be pessimistic and plan to be protected against the worst-case scenario. This dissertation focuses on a moderate mindset that strikes a balance between these two conflicting points of view. This objective is also known as the percentile criterion and can be modeled as risk-aversion to epistemic uncertainty. We propose several RL algorithms that efficiently compute reliable policies from limited data, notably improving the policies' performance and alleviating the computational complexity compared to standard risk-averse RL algorithms. Furthermore, we present a fast and robust feature selection method for linear value function approximation, a standard approach to solving reinforcement learning problems with large state spaces. Our experiments show that our technique is faster and more stable than alternative methods.
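
    The abstract does not spell out the algorithms, so as hedged background here is a minimal sketch of the standard robust-MDP machinery such work typically builds on: a worst-case Bellman update over an L1 ambiguity set of radius xi around each empirically estimated transition distribution, with the radii coming from concentration bounds tied to the desired percentile. This is generic background, not the dissertation's method; all names are illustrative.

    import numpy as np

    def worst_case_value(p_hat, v, xi):
        """min_p p @ v  s.t.  ||p - p_hat||_1 <= xi  and p is a distribution."""
        p = p_hat.copy()
        lo = np.argmin(v)
        shift = min(xi / 2.0, 1.0 - p[lo])   # mass moved onto the worst-value state
        p[lo] += shift
        remaining = shift
        for i in np.argsort(v)[::-1]:        # take mass from the best states first
            if remaining <= 0:
                break
            if i == lo:
                continue
            take = min(remaining, p[i])
            p[i] -= take
            remaining -= take
        return float(p @ v)

    def robust_bellman_update(v, P_hat, r, xi, gamma=0.95):
        """One sweep of robust value iteration over empirical models P_hat (A, S, S)."""
        A, S, _ = P_hat.shape
        q = np.empty((A, S))
        for a in range(A):
            for s in range(S):
                q[a, s] = r[a, s] + gamma * worst_case_value(P_hat[a, s], v, xi[a, s])
        return q.max(axis=0)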

    Robot Localization with Weak Maps

    Get PDF
    In this work, we present an approach to indoor localization for a mobile robot based on a weakly-defined prior map. The aim is to estimate the robot's pose even when only incomplete knowledge of the environment is available and, furthermore, to improve the information in the prior map based on measurements. We discuss two different approaches to describing the prior map. In the first approach, a complete map of the environment is given to the robot, but the scale of the map is unknown. The map is represented by an occupancy grid. We present a method based on Monte Carlo localization that successfully estimates both the robot's pose and the scale of the map. In the second approach, the prior map is a 2D sketch provided by a user; it does not contain exact metric information about the building, and some obstacles and features are not fully represented. The aim is to estimate the scale of the map and to modify and correct the prior map given the robot's exact pose. The map is represented in polygonal format in homogeneous coordinates, which allows the uncertainty of its features to be analyzed. We propose two methods to update the prior information in the map: one uses a Kalman filter, and the other is based on geometric constraints. Both methods can partially improve the estimates of room dimensions, locations, and wall orientations, but both are somewhat sensitive to data association errors.
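
    For the first approach, a common way to handle the unknown scale is to fold it into the particle state so that Monte Carlo localization estimates pose and scale jointly. The sketch below assumes that reading; the scan_likelihood argument is a placeholder for a sensor model, and none of this is the thesis code.

    import numpy as np

    rng = np.random.default_rng(0)

    def mcl_scale_step(particles, odom, scan_likelihood):
        """One predict-weight-resample cycle over particles (x, y, theta, scale).

        particles       : (N, 4) poses in map units plus a scale hypothesis
                          (meters per map unit)
        odom            : (dx, dy, dtheta) odometry increment in the robot frame
        scan_likelihood : callable returning p(scan | particle, map)
        """
        n = len(particles)
        dx, dy, dtheta = odom
        # Predict: rotate odometry into the map frame and convert meters to map
        # units with each particle's scale; the scale does a slow random walk.
        c, s = np.cos(particles[:, 2]), np.sin(particles[:, 2])
        particles[:, 0] += (c * dx - s * dy) / particles[:, 3] + rng.normal(0, 0.02, n)
        particles[:, 1] += (s * dx + c * dy) / particles[:, 3] + rng.normal(0, 0.02, n)
        particles[:, 2] += dtheta + rng.normal(0, 0.01, n)
        particles[:, 3] *= np.exp(rng.normal(0, 0.005, n))
        # Weight by the sensor model and resample in proportion to the weights.
        w = np.array([scan_likelihood(p) for p in particles])
        w /= w.sum()
        return particles[rng.choice(n, size=n, p=w)]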

    Monte Carlo Localization in Hand-Drawn Maps

    Full text link
    Robot localization is one of the most important problems in robotics. Most existing approaches assume that a map of the environment is available beforehand and focus on accurate metric localization. In this paper, we address the localization problem when no map of the environment is available beforehand and the robot relies on a hand-drawn map from a non-expert user. We address this problem by expressing the robot pose in pixel coordinates and simultaneously estimating a local deformation of the hand-drawn map. Experiments show that we are able to localize the robot in the correct room with a robustness of up to 80%.

    Fast Feature Selection for Linear Value Function Approximation

    No full text
    Linear value function approximation is a standard approach to solving reinforcement learning problems with large state spaces. Since designing good approximation features is difficult, automatic feature selection is an important research topic. We propose a new method for feature selection that is based on a low-rank factorization of the transition matrix. Our approach derives features directly from high-dimensional raw inputs, such as image data. The method is easy to implement using SVD, and our experiments show that it is faster and more stable than alternative methods.
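
    Once features have been selected, a standard way to use them for linear value function approximation is least-squares temporal-difference learning. The sketch below is generic background under that assumption, not the evaluation pipeline from the paper.

    import numpy as np

    def lstd_weights(phi, phi_next, rewards, gamma=0.95, reg=1e-6):
        """Least-squares TD fit of V(s) ~= phi(s) @ w.

        phi      : (T, k) features of the visited states
        phi_next : (T, k) features of the successor states
        rewards  : (T,)   observed rewards
        """
        A = phi.T @ (phi - gamma * phi_next) + reg * np.eye(phi.shape[1])
        b = phi.T @ rewards
        return np.linalg.solve(A, b)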